Let $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{n\times p}$. Then,
$$ C=AB=\sum_{i=1}^{n} A_{:,i} B_{i,:}. $$Assume, for simplicity $m=n=p$.
Idea: approximate $C$ by randomly sampling $r$ rank-one components.
Algorithm: for $l= 1,\cdots ,r$ pick $i_l\in\{1,···,n\}$ i.i.d. with probability $\mathbb{P}\{i_l=k\}=p_k$ and compute
$$M=\sum_{l=1}^r \frac{1}{rp_{i_l}} A_{:,i_l} B_{i_l,:}$$Rationale: $M$ is an unbiased estimate of $C$,
\begin{align*} \mathbb{E}[M]&=\sum_{l=1}^r \sum_k \mathbb{P}\{i_l=k\} \frac{1}{r p_k} A_{:,k}B_{k,:} =\sum_k A_{:,k}B_{k,:}=C. \end{align*}Importance sampling porbabilities $p_k$ are
Uniform sampling: $p_k=\displaystyle\frac{1}{n}$
Nonuniform sampling:
which is computable in one-pass and requires $O(n)$ memory and $O(n^2)$ operations.
Theorem. [Optimality] $\mathbb{E}[\|M-AB\|_F^2]$ is minimized for $p_k$ given by (1).
Theorem. [Error] Choose $p_k\geq \displaystyle\frac{\beta \|A_{:,k}\|_2 \|B_{k,:}\|_2} {\sum_l \|A_{:,l}\|_2 \|B_{l,:}\|_2}$ for some $0<\beta \leq 1$. If $r\geq \displaystyle\frac{\log n}{\beta}$, then
$$\|M-AB\|_F\leq \sqrt{\frac{\log n}{\beta r}} \|A\|_F \|B\|_F $$with probability exceeding $1-O(n^{-10})$.
In [1]:
using Distributions
using Random
Random.seed!(1345)
using LinearAlgebra
using SparseArrays
In [2]:
n=3000
A=rand(n,n)
B=rand(n,n)
C=A*B
β=1.0
log(n)/β
Out[2]:
In [3]:
# Uniform
r=400
iᵣ=rand(1:n,r)
p=1/n
@time M=A[:,iᵣ]*B[iᵣ,:]/(r*p)
@time C=A*B;
In [4]:
norm(M-C),sqrt(log(n)/(β*r))*norm(A)*norm(B), norm(C)
Out[4]:
In [5]:
# Nonuniform
pA=[norm(view(A,:,k)) for k=1:n]
pB=[norm(view(B,k,:)) for k=1:n]
s=pA⋅pB
p=[pA[k]*pB[k]/s for k=1:n]
sum(p)
Out[5]:
In [6]:
iᵣ=rand(Categorical(p),r);
@time Mₙ=A[:,iᵣ]*inv(Diagonal(r*p[iᵣ]))*B[iᵣ,:]
norm(Mₙ-C),sqrt(log(n)/(β*r))*norm(A)*norm(B), norm(C)
Out[6]:
In [7]:
# Sparse, nonuniform
n=10000
A=sprand(n,n,0.1)
B=sprand(n,n,0.1)
@time C=A*B
β=1.0
log(n)/β
Out[7]:
In [8]:
# C is full
nnz(C)/prod(size(C))
Out[8]:
In [9]:
# Nonuniform
pA=[norm(view(A,:,k)) for k=1:n]
pB=[norm(view(B,k,:)) for k=1:n]
s=pA⋅pB
p=[pA[k]*pB[k]/s for k=1:n]
sum(p)
Out[9]:
In [10]:
r=1000
iᵣ=rand(Categorical(p),r);
@time Mₙ=A[:,iᵣ]*inv(Diagonal(r*p[iᵣ]))*B[iᵣ,:]
norm(Mₙ-C),sqrt(log(n)/(β*r))*norm(A)*norm(B), norm(C)
Out[10]:
In [ ]: